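This paper proposes a new approach that combines convolutional layers (CLs) and large-margin metric learning to train models on small datasets for texture classification. The core of the approach is a loss function that computes the distances between instances of interest and support vectors. The goal is to update the weights of the CLs iteratively so as to learn a representation with a large margin between classes. Each iteration yields a large-margin discriminant model represented by support vectors based on that representation. The advantage of the proposed approach with respect to convolutional neural networks (CNNs) is twofold. First, because it has fewer parameters than an equivalent CNN, it allows representation learning with a small amount of data. Second, it has a low training cost, since backpropagation considers only the support vectors. Experimental results on texture and histopathologic image datasets show that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence than equivalent CNNs.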
translated by 谷歌翻译
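Despite the growing availability of high-capacity computational platforms, implementation complexity remains a major concern for the real-world deployment of neural networks. This concern stems not only from the huge cost of state-of-the-art network architectures, but also from the recent push towards edge intelligence and the use of neural networks in embedded applications. In this context, network compression techniques have attracted interest because they can reduce deployment costs while keeping inference accuracy at a satisfactory level. This paper is dedicated to the development of a novel compression scheme for neural networks. To this end, a new $\ell_0$-norm-based regularization approach is first developed, which is capable of inducing strong sparseness in the network during training. Then, by targeting the smaller weights of the trained network with pruning techniques, smaller yet highly effective networks can be obtained. The proposed compression scheme also involves the use of $\ell_2$-norm regularization to avoid overfitting, as well as fine-tuning to improve the performance of the pruned network. Experimental results are presented to show the effectiveness of the proposed scheme and to compare it with competing approaches.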
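The pruning step mentioned in the abstract above can be illustrated with a minimal sketch of generic magnitude pruning. It is not the paper's $\ell_0$-norm regularizer or its exact pruning rule: the function name `prune_by_magnitude`, the flat list of weights, and the idea of pruning a fixed fraction (`sparsity`) are all assumptions made for illustration.

```python
from typing import List


def prune_by_magnitude(weights: List[float], sparsity: float) -> List[float]:
    """Return a copy of `weights` with the smallest-magnitude entries set to zero.

    `sparsity` is the fraction of weights to remove (hypothetical parameter,
    not taken from the paper). Ties are broken by original position.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must lie in [0, 1]")

    n_prune = int(len(weights) * sparsity)

    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))

    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


if __name__ == "__main__":
    layer = [0.5, -0.01, 0.2, 0.003]
    print(prune_by_magnitude(layer, sparsity=0.5))
```

A sparsity-inducing regularizer applied during training would, in principle, push many weights towards zero first, so that a step like this one removes them with little loss in accuracy; fine-tuning the remaining weights afterwards is the usual follow-up.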
In this work, deep learning techniques for brain age prediction from magnetic resonance images are investigated, with the aim of assisting in the identification of biomarkers of the natural aging process. Identifying such biomarkers is useful for detecting a neurodegenerative process at an early stage, as well as for predicting age-related or non-age-related cognitive decline. Two techniques are implemented and compared in this work: a 3D convolutional neural network applied to the volumetric image, and a 2D convolutional neural network applied to slices from the axial plane, with subsequent fusion of the individual predictions. The best result was obtained by the 2D model, which achieved a mean absolute error of 3.83 years.
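Automatic speech recognition (ASR) is a complex and challenging task. In recent years, the area has seen significant advances. For the Brazilian Portuguese (BP) language in particular, about 376 hours were publicly available for ASR tasks in the second half of 2020. With the release of new datasets in early 2021, this number increased to 574 hours. The existing resources, however, consist of audios containing only read and prepared speech. Datasets that include spontaneous speech, which is essential for many ASR applications, are lacking. This paper presents CORAA (Corpus of Annotated Audios) v1, a publicly available dataset for ASR in BP with 290.77 hours of validated pairs (audio-transcription). CORAA also contains European Portuguese audios (4.69 hours). We also present a public ASR model based on Wav2Vec 2.0 XLSR-53 and fine-tuned on CORAA. Our model achieved a word error rate of 24.18% on the CORAA test set and 20.08% on the Common Voice test set. When measuring the character error rate, we obtained 11.02% and 6.34% for CORAA and Common Voice, respectively. The CORAA corpora were assembled both to improve ASR models in BP with phenomena from spontaneous speech and to motivate young researchers to start their studies on ASR for Portuguese. All the corpora are publicly available at https://github.com/nilc-nlp/coraa under the CC BY-NC-ND 4.0 license.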
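For readers unfamiliar with the word error rate reported above, the following is a minimal sketch of how the metric is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate`, the whitespace tokenization, and the example sentences are assumptions for illustration; the abstract does not describe the authors' evaluation code or any text normalization they applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Tokenization is plain whitespace splitting (an assumption; real
    evaluations usually normalize case and punctuation first).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of five reference words is missing from the hypothesis.
    print(word_error_rate("o gato subiu no telhado", "o gato subiu telhado"))
```

The character error rate follows the same recipe with characters in place of words, which is one reason it is typically lower than the word error rate for the same output.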
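Deep learning techniques have been shown to be effective in a variety of tasks, particularly in the development of speech recognition systems, that is, systems that aim to transcribe an audio sentence into a sequence of written words. Despite progress in the area, speech recognition can still be considered difficult, especially for languages lacking available data, such as Brazilian Portuguese (BP). In this sense, this work presents the development of a public automatic speech recognition (ASR) system using only openly available audio data, obtained by fine-tuning the Wav2Vec 2.0 XLSR-53 model, which was pre-trained on many languages, on BP data. The final model achieves an average word error rate of 12.4% over 7 different datasets (10.5% when applying a language model). To the best of our knowledge, this is the best result for BP among open ASR systems.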
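The use of the iris and periocular region as biometric traits has been extensively investigated, mainly because of the singularity of the iris features and the usefulness of the periocular region when the image resolution is not sufficient to extract iris information. In addition to providing information about an individual's identity, features extracted from these traits can also be explored to obtain other information, such as the individual's gender, the influence of drug use, the use of contact lenses, and spoofing. This work presents a survey of the databases created for ocular recognition, detailing their protocols and how their images were acquired. We also describe and discuss the most popular ocular recognition competitions (contests), highlighting the submitted algorithms that achieved the best results using only iris traits and also fusing iris and periocular region information. Finally, we describe some relevant works applying deep learning techniques to ocular recognition and point out new challenges and future directions. Considering that there are a large number of ocular databases, and that each one is usually designed for a specific problem, we believe this survey can provide a broad overview of the challenges in ocular biometrics.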
The Elo algorithm, due to its simplicity, is widely used for rating in sports competitions as well as in other applications where the rating/ranking is a useful tool for predicting future results. However, despite its widespread use, a detailed understanding of the convergence properties of the Elo algorithm is still lacking. Aiming to fill this gap, this paper presents a comprehensive (stochastic) analysis of the Elo algorithm, considering round-robin (one-on-one) competitions. Specifically, analytical expressions are derived characterizing the behavior/evolution of the skills and of important performance metrics. Then, taking into account the relationship between the behavior of the algorithm and the step-size value, which is a hyperparameter that can be controlled, some design guidelines as well as discussions about the performance of the algorithm are provided. To illustrate the applicability of the theoretical findings, experimental results are shown, corroborating the very good match between the analytical predictions and the results obtained from the algorithm using real-world data (from the Italian SuperLega volleyball league).
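As a point of reference for the step-size hyperparameter mentioned above, here is a minimal sketch of the textbook Elo update for a single one-on-one game. The base-10 logistic with a scale of 400, the default step size of 20, and the function names `expected_score` and `elo_update` are conventional choices assumed for illustration; the abstract does not state which parameterization the paper analyzes.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of player A against player B (standard logistic form)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(
    rating_a: float,
    rating_b: float,
    score_a: float,
    step_size: float = 20.0,
    scale: float = 400.0,
) -> Tuple[float, float]:
    """Return the updated ratings after one game.

    `score_a` is 1.0 if A wins, 0.0 if A loses (0.5 for a draw, where allowed).
    `step_size` is the controllable hyperparameter, often written K.
    """
    delta = step_size * (score_a - expected_score(rating_a, rating_b, scale))
    # The update is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated teams: the winner gains half the step size.
    print(elo_update(1500.0, 1500.0, score_a=1.0))
```

A larger `step_size` makes ratings react faster to new results but fluctuate more once they are near their steady values, which is the kind of trade-off that design guidelines for this hyperparameter have to balance.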
We describe a Physics-Informed Neural Network (PINN) that simulates the flow induced by the astronomical tide in a synthetic port channel, with dimensions based on the Santos-São Vicente-Bertioga Estuarine System. PINN models aim to combine the knowledge of physical systems and data-driven machine learning models. This is done by training a neural network to minimize the residuals of the governing equations in sample points. In this work, our flow is governed by the Navier-Stokes equations with some approximations. There are two main novelties in this paper. First, we design our model to assume that the flow is periodic in time, which is not feasible in conventional simulation methods. Second, we evaluate the benefit of resampling the function evaluation points during training, which has a near zero computational cost and has been verified to improve the final model, especially for small batch sizes. Finally, we discuss some limitations of the approximations used in the Navier-Stokes equations regarding the modeling of turbulence and how it interacts with PINNs.
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Text classification is a natural language processing (NLP) task relevant to many commercial applications, like e-commerce and customer service. Naturally, classifying such texts accurately often represents a challenge, due to intrinsic language aspects, like irony and nuance. To accomplish this task, one must provide a robust numerical representation for documents, a process known as embedding. Embedding represents a key NLP field nowadays, having advanced significantly in the last decade, especially after the introduction of the word-to-vector concept and the popularization of Deep Learning models for solving NLP tasks, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformer-based Language Models (TLMs). Despite the impressive achievements in this field, the literature coverage regarding generating embeddings for Brazilian Portuguese texts is scarce, especially when considering commercial user reviews. Therefore, this work aims to provide a comprehensive experimental study of embedding approaches targeting a binary sentiment classification of user reviews in Brazilian Portuguese. This study covers NLP models ranging from classical (Bag-of-Words) to state-of-the-art (Transformer-based). The methods are evaluated with five open-source databases with pre-defined data partitions made available in an open digital repository to encourage reproducibility. Fine-tuned TLMs achieved the best results in all cases, followed by the Feature-based TLM, LSTM, and CNN, with alternating ranks depending on the database under analysis.